Mathematical Methods for Supervised Learning

Authors

  • Ronald DeVore
  • Gerard Kerkyacharian
  • Dominique Picard
  • Vladimir Temlyakov
Abstract

Let ρ be an unknown Borel measure defined on the space Z := X × Y with X ⊂ ℝ and Y = [−M, M]. Given a set z of m samples z_i = (x_i, y_i) drawn according to ρ, the problem of estimating a regression function f_ρ using these samples is considered. The main focus is to understand what rate of approximation, measured either in expectation or in probability, can be obtained under a given prior f_ρ ∈ Θ, i.e. under the assumption that f_ρ is in the set Θ, and which algorithms achieve optimal or semi-optimal (up to logarithms) results. The optimal rate of decay in terms of m is established for many priors given either in terms of the smoothness of f_ρ or its rate of approximation measured in one of several ways. This optimal rate is determined by two types of results. Upper bounds are established using various tools in approximation such as entropy, widths, and linear and nonlinear approximation. Lower bounds are proved using Kullback-Leibler information together with Fano inequalities and a certain type of entropy. A distinction is drawn between algorithms which employ knowledge of the prior in the construction of the estimator and those that do not. Algorithms of the second type which are universally optimal for a certain range of priors are given.
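
For orientation, the quantities named in the abstract can be written out as follows. This is a sketch in standard notation, not taken from the abstract itself: the estimator symbol f_z, the marginal ρ_X, the threshold η, and the choice of the L2(X, ρ_X) norm are assumptions introduced here for illustration.

```latex
% Regression function: conditional expectation of y given x
% (\rho(y \mid x) denotes the conditional measure on Y).
f_\rho(x) := \int_Y y \, d\rho(y \mid x), \qquad x \in X.

% Assumed error measure for an estimator f_z built from the m samples z,
% with \rho_X the marginal of \rho on X:
\| f_z - f_\rho \|_{L_2(X,\rho_X)}
  = \left( \int_X | f_z(x) - f_\rho(x) |^2 \, d\rho_X(x) \right)^{1/2}.

% Rate "in expectation" and "in probability" over draws z \sim \rho^m:
\mathbb{E}_{\rho^m} \| f_z - f_\rho \|_{L_2(X,\rho_X)}
\qquad \text{and} \qquad
\rho^m \{ z : \| f_z - f_\rho \|_{L_2(X,\rho_X)} > \eta \}, \quad \eta > 0.
```

Under these assumptions, an optimal rate for a prior Θ is the fastest decay in m of the first quantity (or of the second, for suitable η) that some estimator attains uniformly over all ρ with f_ρ ∈ Θ.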





Publication date: 2005